video clips
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine (0.93)
- Information Technology (0.67)
ViLCo-Bench: VIdeo Language COntinual learning Benchmark
Tang, Tianqi
- For what purpose was the dataset created? To address this, we propose ViLCo-Bench.
- Who created the dataset (e.g., which team, research group) and on behalf of which entity?
- Who funded the creation of the dataset?
- What do the instances that comprise the dataset represent (e.g., documents, photos)?
- What data does each instance consist of?
- Is there a label or target associated with each instance?
- Oceania > Australia > New South Wales (0.05)
- Europe > United Kingdom > Wales (0.04)
- Government (0.95)
- Information Technology (0.69)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > China > Hong Kong (0.04)
Kinetics-400 [1] is a large-scale action recognition dataset with trimmed video clips of around 10 seconds each. It is collected from realistic YouTube videos and covers 400 categories of human activities. In total, it contains around 240K training videos and 20K validation videos. Specifically, when training Kinetics-200/-400 from scratch, we adopt a cosine learning-rate decay schedule with an initial learning rate of 0.1. The initial learning rate is 0.005 and decays by a factor of 0.1 at epochs 20 and 40.
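Purely as a hedged sketch of this recipe (not code from the Kinetics papers; the SGD optimizer, momentum, and total epoch counts are assumptions), the two learning-rate settings above could be set up in PyTorch as follows:

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR, MultiStepLR

# Stand-in model; the actual Kinetics backbone is not specified in this excerpt.
model = torch.nn.Linear(512, 400)

# From-scratch setting: cosine decay from an initial learning rate of 0.1.
optimizer_scratch = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
cosine_sched = CosineAnnealingLR(optimizer_scratch, T_max=100)  # total epochs assumed

# Step setting: start at 0.005 and multiply by 0.1 at epochs 20 and 40.
optimizer_step = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
step_sched = MultiStepLR(optimizer_step, milestones=[20, 40], gamma=0.1)

for epoch in range(50):
    # ... one training epoch over Kinetics clips would run here ...
    step_sched.step()
```

MultiStepLR reproduces the "decay by 0.1 at epochs 20 and 40" rule, while CosineAnnealingLR covers the from-scratch cosine schedule.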
EEG2Video: Towards Decoding Dynamic Visual Perception from EEG Signals
Our visual experience in daily life is dominated by dynamic change. Decoding such dynamic information from brain activity can enhance our understanding of the brain's visual processing system. However, previous studies have predominantly focused on reconstructing static visual stimuli. In this paper, we explore decoding dynamic visual perception from electroencephalography (EEG), a neuroimaging technique able to record brain activity at high temporal resolution (1000 Hz) and thus capture rapid changes in the brain. Our contributions are threefold: first, we develop a large dataset recording signals from 20 subjects while they watched 1400 dynamic video clips covering 40 concepts.
From Observation to Action: Latent Action-based Primitive Segmentation for VLA Pre-training in Industrial Settings
Zhang, Jiajie, Schwertfeger, Sören, Kleiner, Alexander
We present a novel unsupervised framework to unlock vast unlabeled human demonstration data from continuous industrial video streams for Vision-Language-Action (VLA) model pre-training. Our method first trains a lightweight motion tokenizer to encode motion dynamics, then employs an unsupervised action segmenter leveraging a novel "Latent Action Energy" metric to discover and segment semantically coherent action primitives. The pipeline outputs both segmented video clips and their corresponding latent action sequences, providing structured data directly suitable for VLA pre-training. Evaluations on public benchmarks and a proprietary electric motor assembly dataset demonstrate effective segmentation of key tasks performed by humans at workstations. Further clustering and quantitative assessment via a Vision-Language Model confirm the semantic coherence of the discovered action primitives. To our knowledge, this is the first fully automated end-to-end system for extracting and organizing VLA pre-training data from unstructured industrial videos, offering a scalable solution for embodied AI integration in manufacturing.
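The "Latent Action Energy" metric itself is not defined in this excerpt. Purely as an illustration of energy-based segmentation over latent action tokens, the sketch below uses the norm of consecutive latent differences as a stand-in energy signal and cuts a new primitive wherever that signal drops below a threshold; the energy proxy and all names are assumptions, not the authors' method.

```python
import numpy as np

def segment_by_energy(latents: np.ndarray, threshold: float) -> list[tuple[int, int]]:
    """Split a latent action sequence into primitives at low-energy frames.

    `latents` has shape (T, D): one latent action token per frame.
    The energy proxy (norm of consecutive latent differences) is an
    assumption for illustration, not the paper's Latent Action Energy.
    """
    # Frame-to-frame motion energy; near-zero values suggest a pause between primitives.
    energy = np.linalg.norm(np.diff(latents, axis=0), axis=1)
    boundaries = [0]
    for t, e in enumerate(energy, start=1):
        if e < threshold and t - boundaries[-1] > 1:
            boundaries.append(t)
    boundaries.append(len(latents))
    # Return (start, end) index pairs, one per discovered primitive segment.
    return list(zip(boundaries[:-1], boundaries[1:]))

# Toy usage: random-walk latents, threshold set from a low percentile of the energy signal.
rng = np.random.default_rng(0)
demo = np.cumsum(rng.normal(scale=0.5, size=(200, 16)), axis=0)
energies = np.linalg.norm(np.diff(demo, axis=0), axis=1)
print(segment_by_energy(demo, threshold=float(np.percentile(energies, 10))))
```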
Sekai: A Video Dataset towards World Exploration
Li, Zhen, Li, Chuanhao, Mao, Xiaofeng, Lin, Shaoheng, Li, Ming, Zhao, Shitian, Xu, Zhaopan, Li, Xinyue, Feng, Yukang, Sun, Jianwen, Li, Zizhen, Zhang, Fanrui, Ai, Jiaxin, Wang, Zhixiang, Wu, Yuwei, He, Tong, Pang, Jiangmiao, Qiao, Yu, Jia, Yunde, Zhang, Kaipeng
Video generation techniques have made remarkable progress, promising to become the foundation of interactive world exploration. However, existing video generation datasets are not well suited for world exploration training, as they suffer from several limitations: limited locations, short duration, static scenes, and a lack of annotations about exploration and the world. In this paper, we introduce Sekai (meaning "world" in Japanese), a high-quality first-person-view worldwide video dataset with rich annotations for world exploration. It consists of over 5,000 hours of walking or drone-view (FPV and UAV) videos from over 100 countries and regions across 750 cities. We develop an efficient and effective toolbox to collect, pre-process, and annotate videos with location, scene, weather, crowd density, captions, and camera trajectories. Comprehensive analyses and experiments demonstrate the dataset's scale, diversity, annotation quality, and effectiveness for training video generation models. We believe Sekai will benefit the areas of video generation and world exploration and motivate valuable applications. The project page is https://lixsp11.github.io/sekai-project/.
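As a hypothetical sketch of what one per-clip annotation record might look like given the fields listed in the abstract (the field names and types are assumptions, not the released Sekai schema):

```python
from dataclasses import dataclass


@dataclass
class SekaiClipAnnotation:
    # Hypothetical layout mirroring the annotation types listed in the abstract;
    # field names and types are assumptions, not the released schema.
    location: str                                         # e.g. "Tokyo, Japan"
    scene: str                                            # e.g. "street", "park"
    weather: str                                          # e.g. "sunny", "rain"
    crowd_density: str                                    # e.g. "sparse", "dense"
    caption: str                                          # free-text description of the clip
    camera_trajectory: list[tuple[float, float, float]]   # assumed per-frame camera positions
```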
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States (0.04)
- (4 more...)